✨ switch to Huber Regression with L2 normalization for sliding window method #44

Merged
sambra95 merged 6 commits into main from switch_to_huber_reg
Feb 27, 2026

@enryH (Collaborator) commented Feb 25, 2026

before (Theil-Sen):

{ 'sliding_window': { 'model_type': 'sliding_window',
                      'params': { 'fit_t_max': 36.6,
                                  'fit_t_min': 35.4,
                                  'intercept': -4.663280857018275,
                                  'slope': 0.0778949612485742,
                                  'time_at_umax': 36.0,
                                  'window_points': 7}}}

now (HuberRegressor):

{ 'sliding_window': { 'model_type': 'sliding_window',
                      'params': { 'fit_t_max': 36.8,
                                  'fit_t_min': 35.6,
                                  'intercept': -4.663248126266753,
                                  'slope': 0.07789418105471918,
                                  'time_at_umax': 36.2,
                                  'window_points': 7}}}

The latter is roughly twice as fast.
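
For readers who want to reproduce this kind of comparison, here is a minimal sketch of fitting one window with both estimators. The window data are synthetic and made up for illustration (they are not the data behind the numbers above), and the variable names `t_win` and `y_log_win` are borrowed from the review snippets further down.

```python
import numpy as np
from scipy.stats import theilslopes
from sklearn.linear_model import HuberRegressor

# Hypothetical 7-point window: a log-transformed signal that grows
# linearly in time, with small noise and one injected outlier.
rng = np.random.default_rng(0)
t_win = np.linspace(35.4, 36.6, 7)
y_log_win = 0.078 * t_win - 4.66 + rng.normal(0.0, 0.002, t_win.size)
y_log_win[5] += 0.5  # single outlier

# Theil-Sen: median of pairwise slopes. Note the argument order (y, x).
ts_slope, ts_intercept, *_ = theilslopes(y_log_win, t_win)

# Huber: linear fit with the Huber loss. scikit-learn expects a 2D
# feature matrix, hence the reshape.
huber = HuberRegressor().fit(t_win.reshape(-1, 1), y_log_win)

print(f"Theil-Sen: slope={ts_slope:.5f}, intercept={ts_intercept:.5f}")
print(f"Huber:     slope={huber.coef_[0]:.5f}, intercept={huber.intercept_:.5f}")
```

Timing the two calls (for example with `timeit`) on real windows would be the way to check the speed claim; the ratio will depend on window size and data.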

A regression in the plotting.ipynb tutorial needs to be fixed in a separate PR.

@enryH enryH requested review from Copilot and sambra95 February 27, 2026 15:25
@enryH enryH marked this pull request as ready for review February 27, 2026 15:25
Copilot AI (Contributor) left a comment

Pull request overview

This pull request switches the sliding window method from Theil-Sen regression to HuberRegressor for an approximately 2x speedup while maintaining similar accuracy. It also migrates dependency management from requirements.txt to direct specification in pyproject.toml, which is a more modern and streamlined approach.

Changes:

  • Replaced scipy's theilslopes with scikit-learn's HuberRegressor in the sliding window fitting algorithm
  • Migrated dependencies from requirements.txt to pyproject.toml's dependencies field
  • Added scikit-learn as a new dependency
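
As a rough illustration of the first change, the sketch below slides a fixed-size window over log-transformed data, fits a HuberRegressor in each window, and keeps the window with the steepest slope. This is an assumption about how such a method could look, not the code in src/growthcurves/non_parametric.py: the helper name `fit_sliding_window`, the choice of the window mean as `time_at_umax`, and the synthetic curve are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor


def fit_sliding_window(t, y_log, window_points=7):
    """Return parameters of the window with the steepest Huber-fitted slope.

    Hypothetical helper for illustration only.
    """
    best = None
    for start in range(len(t) - window_points + 1):
        t_win = t[start : start + window_points]
        y_win = y_log[start : start + window_points]
        try:
            model = HuberRegressor().fit(t_win.reshape(-1, 1), y_win)
        except ValueError:
            # HuberRegressor raises ValueError if its solver fails;
            # skip that window rather than abort the scan.
            continue
        slope = float(model.coef_[0])
        if best is None or slope > best["slope"]:
            best = {
                "slope": slope,
                "intercept": float(model.intercept_),
                "fit_t_min": float(t_win[0]),
                "fit_t_max": float(t_win[-1]),
                "time_at_umax": float(t_win.mean()),  # assumed definition
                "window_points": window_points,
            }
    return best


# Synthetic logistic growth curve on a log scale (made-up parameters).
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.2)
y_log = np.log(1.0 / (1.0 + np.exp(-0.5 * (t - 10.0)))) + rng.normal(0.0, 0.01, t.size)

params = fit_sliding_window(t, y_log)
print(params)
```

The returned keys mirror the `params` dictionary shown in the PR description, so the output can be compared side by side with it.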

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/growthcurves/non_parametric.py | Replaced Theil-Sen estimator with HuberRegressor for robust linear regression in sliding window method; added sklearn import |
| requirements.txt | Removed file entirely as dependencies now specified in pyproject.toml |
| pyproject.toml | Added direct dependency specification including new scikit-learn dependency; commented out requirements.txt loading |

Comment on lines +94 to +98
# # Use Theil-Sen estimator for robust line fitting
# result = theilslopes(y_log_win, t_win)
# slope, intercept = result.slope, result.intercept
# # Use HuberRegressor which uses L2 regularization and is twice as fast as
# # Theil-Sen.

Copilot AI Feb 27, 2026

The comment has a double hash '# #' which appears to be a formatting issue. This should be cleaned up to use a single '#' for consistency with standard Python comment formatting.

Comment on lines +94 to +96
# # Use Theil-Sen estimator for robust line fitting
# result = theilslopes(y_log_win, t_win)
# slope, intercept = result.slope, result.intercept

Copilot AI Feb 27, 2026

The commented-out code for Theil-Sen should be removed rather than left in place. Since this is a deliberate switch to HuberRegressor and the PR description indicates it's roughly twice as fast with similar results, the old code should be deleted to keep the codebase clean.

Comment on lines +97 to +98
# # Use HuberRegressor which uses L2 regularization and is twice as fast as
# # Theil-Sen.

Copilot AI Feb 27, 2026

The comment states "uses L2 regularization", which is misleading. HuberRegressor does apply an L2 penalty to the coefficients (controlled by the alpha parameter), but its defining feature is the Huber loss function, which makes the fit robust to outliers. The comment should say that HuberRegressor is robust to outliers through the Huber loss.

Suggested change
- # # Use HuberRegressor which uses L2 regularization and is twice as fast as
- # # Theil-Sen.
+ # # Use HuberRegressor, which is robust to outliers via the Huber loss
+ # # and is typically faster than Theil-Sen.
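
To make the distinction between the loss and the penalty concrete, here is a small NumPy-only sketch. The Huber loss is written in the piecewise form given in the scikit-learn documentation (quadratic below `epsilon`, linear above), and the defaults `epsilon=1.35` and `alpha=1e-4` are HuberRegressor's documented defaults. The function names are hypothetical, and the sketch leaves out the scale parameter that scikit-learn estimates jointly with the coefficients.

```python
import numpy as np


def huber_loss(z, epsilon=1.35):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    z = np.asarray(z, dtype=float)
    quadratic = z**2
    linear = 2.0 * epsilon * np.abs(z) - epsilon**2
    return np.where(np.abs(z) < epsilon, quadratic, linear)


def l2_penalty(coef, alpha=1e-4):
    """Squared L2 penalty on the coefficients, scaled by alpha."""
    return alpha * np.sum(np.asarray(coef, dtype=float) ** 2)


residuals = np.array([0.1, -0.5, 10.0])
print(huber_loss(residuals))   # the large residual is penalised linearly
print(residuals**2)            # squared loss grows much faster
print(l2_penalty([0.078]))     # tiny contribution at the default alpha
```

The robustness comes from the first function: an outlier with residual 10 contributes far less than it would under squared loss, while the L2 term only shrinks the coefficients slightly.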

from logging import getLogger

import numpy as np
import sklearn.linear_model

Copilot AI Feb 27, 2026

The import style `import sklearn.linear_model` is inconsistent with the rest of the codebase, which uses the `from <package>.<module> import <class>` pattern (e.g., `from scipy.interpolate import make_smoothing_spline`). Consider using `from sklearn.linear_model import HuberRegressor` for consistency.

Comment on lines +17 to +19
# from scipy.stats import theilslopes


Copilot AI Feb 27, 2026

The commented-out import `from scipy.stats import theilslopes` should be removed entirely as it's no longer needed after switching to HuberRegressor.

Suggested change
- # from scipy.stats import theilslopes

Comment on lines 25 to +29
# use requirements.txt instead of pyproject.toml for dependencies
# https://stackoverflow.com/a/73600610/9684872
# ! uncomment also dependencies in the dynamic section above
- [tool.setuptools.dynamic]
- dependencies = {file = ["requirements.txt"]}
+ # [tool.setuptools.dynamic]
+ # dependencies = {file = ["requirements.txt"]}

Copilot AI Feb 27, 2026

The outdated comments from the previous requirements.txt approach should be removed. These comments (lines 25-27, 28-29) reference the old approach and are misleading now that dependencies are defined directly in the dependencies list.

@sambra95 sambra95 merged commit c6c554d into main Feb 27, 2026
11 checks passed
@sambra95 sambra95 deleted the switch_to_huber_reg branch February 27, 2026 15:46

3 participants